    Levenshtein distances fail to identify language relationships accurately

    The Levenshtein distance is a simple distance metric derived from the number of edit operations needed to transform one string into another. This metric has received recent attention as a means of automatically classifying languages into genealogical subgroups. In this article I test the performance of the Levenshtein distance for classifying languages by subsampling three language subsets from a large database of Austronesian languages. Comparing the classification proposed by the Levenshtein distance to that of the comparative method shows that the Levenshtein classification is correct only 40% of the time. Standardizing the orthography increases the performance, but only to a maximum of 65% accuracy within language subgroups. The accuracy of the Levenshtein classification decreases rapidly with phylogenetic distance, failing to discriminate homology and chance similarity across distantly related languages. This poor performance suggests the need for more linguistically nuanced methods for automated language classification tasks.
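
    The metric described in this abstract can be illustrated with a minimal sketch. The function names below and the normalization by the longer string's length are assumptions made for illustration; the abstract does not say which variant or implementation the article used. The word pairs are illustrative only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Assumed normalization: divide by the length of the longer string,
    giving a value between 0.0 (identical) and 1.0."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))        # 3
    print(normalized_levenshtein("lima", "rima"))  # 0.25
```

    Because the score counts character edits only, two words that resemble each other by chance score the same as two true cognates that differ by one sound, which is the weakness the abstract reports for distantly related languages.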

    POLLEX-Online: The Polynesian Lexicon Project Online

    Basic vocabulary and Bayesian phylolinguistics

    Donohue et al.’s critique of our work on the origins and spread of the Austronesian language family is marred by misunderstandings of our approach. We respond to these by noting that our Bayesian phylogenetic approach: (1) distinguishes between retentions and innovations probabilistically, (2) focuses on basic vocabulary not ‘the lexicon’, (3) eliminates known loanwords, and (4) produces results that are congruent with the results of the comparative method and conflict with the scenarios requiring unprecedented amounts of language shift postulated by Donohue et al.

    A lexicostatistical study of the Khasian languages: Khasi, Pnar, Lyngngam, and War

    This paper presents the results of lexicostatistical, glottochronological, and Bayesian phylogenetic analyses of a 200-word data set for Standard Khasi, Lyngngam, Pnar and War. Very few works have appeared on the subject of the internal classification of the Khasian branch of Austroasiatic, leaving the existing reference literature disappointingly incomplete. The present analysis supports both the strong identity of Khasian as a unitary branch and an internally nested branching structure that fits neatly with known historical, geographical and linguistic facts. Additionally, lexically based dating methods suggest that the internal diversification of Khasian began roughly between 1500 and 2000 years ago. Copyright Information: Copyright for this paper vested in the authors. Released under Creative Commons Attribution License

    Incorporating contextual audio for an actively anxious smart home

    Why do religious cultures evolve slowly? The cultural evolution of cooperative calling and the historical study of religions

    Collective representations are the result of an immense cooperation, which stretches out not only into space but into time as well; to make them, a multitude of minds have associated, united and combined their ideas and sentiments: for them, long generations have accumulated their experience and their knowledge. A special intellectual activity is therefore concentrated in them, which is infinitely richer and complexer than that of the individual. (Émile Durkheim, Elementary Forms of the Religious Life, [1912] 1965: 29) The languages and folkways of ancient peoples hold little relevance for us, except in one respect: the religions of the ancient world remain our religions. Though religions change, core features of the scriptures and rituals of the world's most popular religious traditions appear to have been conserved with remarkably high fidelity. We explain slow religious change in terms of how religion facilitates cooperation at large social scales. At the end, we clarify how historians of religion, in collaboration with psychologists and computational biologists, might test and improve explanations such as ours. This research was supported by the John F. Templeton Foundation (Testing the Functional Roles of Religion in Human Society, no. 28745), the Royal Society of New Zealand (The Cultural Evolution of Religion, no. 11-UOA-23

    Population structure and cultural geography of a folktale in Europe.

    Despite a burgeoning science of cultural evolution, relatively little work has focused on the population structure of human cultural variation. By contrast, studies in human population genetics use a suite of tools to quantify and analyse spatial and temporal patterns of genetic variation within and between populations. Human genetic diversity can be explained largely as a result of migration and drift giving rise to gradual genetic clines, together with some discontinuities arising from geographical and cultural barriers to gene flow. Here, we adapt theory and methods from population genetics to quantify the influence of geography and ethnolinguistic boundaries on the distribution of 700 variants of a folktale in 31 European ethnolinguistic populations. We find that geographical distance and ethnolinguistic affiliation exert significant independent effects on folktale diversity and that variation between populations supports a clustering concordant with European geography. This pattern of geographical clines and clusters parallels the pattern of human genetic diversity in Europe, although the effects of geographical distance and ethnolinguistic boundaries are stronger for folktales than genes. Our findings highlight the importance of geography and population boundaries in models of human cultural variation and point to key similarities and differences between evolutionary processes operating on human genes and culture.

    Scintillation in the Circinus Galaxy water megamasers

    We present observations of the 22 GHz water vapor megamasers in the Circinus galaxy made with the Tidbinbilla 70 m telescope. These observations confirm the rapid variability seen earlier by Greenhill et al. (1997). We show that this rapid variability can be explained by interstellar scintillation, based on what is now known of the interstellar scintillation seen in a significant number of flat spectrum AGN. The observed variability cannot be fully described by a simple model of either weak or diffractive scintillation. Comment: 10 pages, 5 figures. AJ accepted.

    How Accurate and Robust Are the Phylogenetic Estimates of Austronesian Language Relationships?

    We recently used computational phylogenetic methods on lexical data to test between two scenarios for the peopling of the Pacific. Our analyses of lexical data supported a pulse-pause scenario of Pacific settlement in which the Austronesian speakers originated in Taiwan around 5,200 years ago and rapidly spread through the Pacific in a series of expansion pulses and settlement pauses. We claimed that there was high congruence between traditional language subgroups and those observed in the language phylogenies, and that the estimated age of the Austronesian expansion at 5,200 years ago was consistent with the archaeological evidence. However, the congruence between the language phylogenies and the evidence from historical linguistics was not quantitatively assessed using tree comparison metrics. The robustness of the divergence time estimates to different calibration points was also not investigated exhaustively. Here we address these limitations by using a systematic tree comparison metric to calculate the similarity between the Bayesian phylogenetic trees and the subgroups proposed by historical linguistics, and by re-estimating the age of the Austronesian expansion using only the most robust calibrations. The results show that the Austronesian language phylogenies are highly congruent with the traditional subgroupings, and the date estimates are robust even when calculated using a restricted set of historical calibrations.

    CLICS²: An Improved Database of Cross-Linguistic Colexifications: Assembling Lexical Data with the Help of Cross-Linguistic Data Formats

    The Database of Cross-Linguistic Colexifications (CLICS) has established a computer-assisted framework for the interactive representation of cross-linguistic colexification patterns. In its current form, it has proven to be a useful tool for various kinds of investigation into cross-linguistic semantic associations, ranging from studies on semantic change, patterns of conceptualization, and linguistic paleontology. But CLICS has also been criticized for obvious shortcomings, ranging from the underlying dataset, which still contains many errors, up to the limits of cross-linguistic colexification studies in general. Building on recent standardization efforts reflected in the Cross-Linguistic Data Formats initiative (CLDF) and novel approaches for fast, efficient, and reliable data aggregation, we have created a new database for cross-linguistic colexifications, which not only supersedes the original CLICS database in terms of coverage but also offers a much more principled procedure for the creation, curation and aggregation of datasets. The paper presents the new database and discusses its major features.